Combining information extraction and data integration in the estest system
Authors
Abstract
We describe an approach that builds on techniques from Data Integration and Information Extraction to make better use of the unstructured data found in application domains, such as the Semantic Web, that require the integration of information from structured data sources, ontologies and text. We describe the design and implementation of the ESTEST system, which integrates available structured and semi-structured data sources into a virtual global schema that is used to partially configure an information extraction process. The information extracted from the text is merged with this virtual global database and is available for query processing over the entire integrated resource. As a result of this semantic integration, queries can be answered that could not be answered from the structured and semi-structured data alone. We give some experimental results from the ESTEST system in use.
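The flow described in the abstract (integrate sources into a global schema, use that schema to configure extraction, merge the extracted facts, then query) can be illustrated with a minimal Python sketch. This is not the ESTEST implementation: the names `SOURCES`, `build_global_schema`, `configure_extractor`, `extract` and `run_pipeline` are hypothetical, the sample data is invented for illustration, and simple regular-expression matching stands in for a real information extraction process.

```python
import re
from collections import defaultdict

# Hypothetical structured sources: each maps a concept to its known instances.
SOURCES = [
    {"person": ["Alice Smith"], "vehicle": ["van"]},
    {"person": ["Bob Jones"], "location": ["High Street"]},
]


def build_global_schema(sources):
    """Union the concepts and instances of all sources into one global view."""
    schema = defaultdict(set)
    for source in sources:
        for concept, instances in source.items():
            schema[concept].update(instances)
    return schema


def configure_extractor(schema):
    """Derive one case-insensitive pattern per concept from its known instances."""
    patterns = {}
    for concept, instances in schema.items():
        if not instances:
            continue  # nothing to match for this concept
        alternation = "|".join(re.escape(i) for i in sorted(instances))
        patterns[concept] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns


def extract(patterns, text):
    """Return (concept, matched text) pairs found in free text."""
    return [
        (concept, match.group(0))
        for concept, pattern in patterns.items()
        for match in pattern.finditer(text)
    ]


def run_pipeline(sources, documents):
    """Integrate sources, configure extraction, and merge text-derived facts."""
    schema = build_global_schema(sources)
    patterns = configure_extractor(schema)
    mentions = defaultdict(set)  # (concept, instance) -> ids of mentioning documents
    for doc_id, text in documents.items():
        for concept, value in extract(patterns, text):
            mentions[(concept, value.lower())].add(doc_id)
    return schema, mentions
```

With this sketch, a question such as "which documents mention a person known to the structured sources?" becomes a lookup in `mentions`, which neither the structured sources nor the raw text could answer on their own.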
Similar resources
The ESTEST System - Combining Data Integration and Information Extraction
We describe an approach which combines techniques from Data Integration and Information Extraction in order to make better use of the unstructured data found in applications built over databases containing both structured data and text. We contrast this approach to similar work and then give details of the implementation of our ESTEST system. ESTEST integrates available data sources into a glob...
Combining Data Integration and Information Extraction Techniques
We describe a class of applications which are built using databases comprising some structured data and some free text. Conventional database management systems have proved ineffective for these applications and they are rarely suitable for current text and data mining techniques. We argue that combining Information Extraction and Data Integration techniques is a promising direction for researc...
Combining Database and Information Extraction Techniques to Discover Structure From Partially Structured Data
This paper shows how Information Extraction and Semantic Web Ontology technologies can be combined with information integration techniques in the AutoMed framework to extend the facilities provided by databases for handling free text data. This paper gives a design for a demonstrator system ESTEST (Experimental Software to Extract Structure from Text). This design has several novel features. In...
Combining data integration and information extraction
Improving the ability of computer systems to process text is a significant research challenge. Many applications are based on partially structured databases, where structured data conforming to a schema is combined with free text. Information is stored as text in these applications because the queries required...
Mapping the Potential of Groundwater Resources in Hard Formations Using Geographic Information System and Remote Sensing, Case Study: Northwest of Shahroud
In recent years, rapid population growth has led to increased per capita water use in various sectors, including agriculture and industry, and a growing gap between water demand and water supply has emerged. Therefore, identifying and tracking changes in groundwater resources as a reliable alternative to surface water resources is so important to regions located in the Middle East with...